1. INTRODUCTION


A substantial amount of software engineering research effort
has been focused on the development of software cost estimation
models. A consensus (of sorts) has emerged on that
topic. The following relationship is widely accepted:

H_s = a L^b (1)

where H_s = staff-hours of effort

L = lines of code 
a = a constant 
b = a constant 

The Software Engineering Laboratory (SEL) has devised a 
measure of lines of code based on the origin of the delivered 
code that is substituted in the equation above. This is 

L_dev = N + E + 0.2 (S + O) (2)

where L_dev = "developed" lines of code

N = newly implemented lines of code 
E = extensively modified lines of code 
S = slightly modified lines of code 
O = old (unchanged) lines of code 
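
As a rough illustration of Equations 1 and 2, the sketch below computes "developed" lines of code from the four origin categories and passes the result to the power-law effort model. The function names and the line counts are hypothetical. The constants a and b are left as required arguments because their values come from a regression fit, not from the form of the model.

```python
def developed_lines(new, extensively_modified, slightly_modified, old):
    """Equation 2: L_dev = N + E + 0.2 (S + O)."""
    return new + extensively_modified + 0.2 * (slightly_modified + old)


def staff_hours(lines, a, b):
    """Equation 1: H_s = a * L**b, with a and b supplied by the caller."""
    return a * lines ** b


if __name__ == "__main__":
    # Hypothetical line counts, chosen only to exercise the formulas.
    l_dev = developed_lines(new=1000, extensively_modified=500,
                            slightly_modified=2000, old=3000)
    print(l_dev)  # 2500.0
    # a = 1.0 and b = 1.1 are the values this memorandum reports
    # for its sample; other environments would need their own fit.
    print(staff_hours(l_dev, a=1.0, b=1.1))
```

Because slightly modified and old code is weighted at 0.2, reuse lowers the developed size and therefore the estimated effort.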

Equation 1 using "developed" lines of code has given good 
results as an estimator of development effort. (The analyses
in this document are based on a sample of 20 ground-based
attitude systems.) Table 13 shows a regression analysis
that produced a correlation of 0.99 and an estimate of
b of 1.1 when the value of a was fixed at 1.0 in Equation 1.
Despite these encouraging results, this model has two significant
limitations. These are the following:

• The substantial amount of development work done in 
activities other than code implementation may not be 
adequately considered in the lines of code measure. 
• The number of lines of code, whether "delivered" or "developed",
is not known accurately until late in the development
cycle, when accurate estimates are less useful.

The purpose of this memorandum is to discuss these limita- 
tions and to propose some alternative estimation models that 
can be used earlier in the development process, e.g., during 
requirements analysis and preliminary design. 

2. MODELS OF WORK 

The obvious alternative to lines of code as a measure of the 
work done is pages of documentation. Although only a por- 
tion of a software development team is involved in coding, 
almost everyone produces some documentation. This includes 
requirements, design, and operations documents. Table 1 compares
the components of developed lines of code with pages
of documentation as estimators of programmer hours. A regression
model based on the two most strongly correlated
measures is described in Table 2. This model showed the
following relationship: 


H_p = 0.056 N + 4.15 D (3)

where H_p = programmer hours

N = newly implemented lines of code 
D = pages of documentation 

A similar comparison is made in Table 3 for these measures 
as estimators of staff-hours (including programmer, manager, 
and other hours). A regression model based on the two most
strongly correlated measures is described in Table 4. This
model showed the following relationship: 

H_s = 0.051 N + 7.10 D (4)

where H_s = staff-hours

N = newly implemented lines of code 
D = pages of documentation 
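
Equations 3 and 4 can be applied directly once the new lines of code and pages of documentation are known. The following is a minimal sketch. The function names and the sample inputs are hypothetical, and the coefficients are the ones quoted above for this particular sample of systems.

```python
def programmer_hours(new_lines, doc_pages):
    """Equation 3: H_p = 0.056 N + 4.15 D."""
    return 0.056 * new_lines + 4.15 * doc_pages


def total_staff_hours(new_lines, doc_pages):
    """Equation 4: H_s = 0.051 N + 7.10 D."""
    return 0.051 * new_lines + 7.10 * doc_pages


if __name__ == "__main__":
    # Hypothetical project: 10,000 new lines and 200 pages of documentation.
    n, d = 10_000, 200
    print(round(programmer_hours(n, d)))   # 1390
    print(round(total_staff_hours(n, d)))  # 1930
```

In this hypothetical case, documentation accounts for 830 of the 1390 programmer hours and 1420 of the 1930 staff-hours.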
The correlation coefficient (r) associated with each of the
relationships expressed in Equations 3 and 4 was 0.97, com- 
parable to that obtained by substituting Equation 2 for L in 
Equation 1. These results suggest that the best measures of 
work done are lines of new code and pages of documentation. 
Reused lines of code do not seem to contribute directly to 
resource expenditures. However, the requirements analysis 
and design effort involved in reusing previously developed 
code may be included in the pages of documentation measure.
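
For readers who want to repeat this kind of analysis on their own data, the sketch below shows one way to obtain two-term coefficients of the form used in Equations 3 and 4. It assumes an ordinary least-squares fit with no intercept, which the form of the equations suggests but the text does not confirm. Tables 2 and 4 remain the authority on the procedure actually used. The function name is hypothetical and the data are made up for illustration. They are not SEL measurements.

```python
def fit_two_terms(x1, x2, y):
    """Least-squares fit of y = c1*x1 + c2*x2 with no intercept.

    Builds the 2x2 normal equations and solves them by Cramer's rule.
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * h for a, h in zip(x1, y))
    s2y = sum(b * h for b, h in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear; coefficients are not unique")
    c1 = (s1y * s22 - s2y * s12) / det
    c2 = (s11 * s2y - s12 * s1y) / det
    return c1, c2


if __name__ == "__main__":
    # Made-up data generated from hours = 0.05 * lines + 4 * pages.
    new_lines = [1000, 2000, 4000]
    doc_pages = [10, 50, 20]
    hours = [90, 300, 280]
    print(fit_two_terms(new_lines, doc_pages, hours))
```

With noise-free inputs like these, the fit recovers the generating coefficients. Real project data would leave residuals, which is what the correlation coefficient summarizes.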

Although pages of documentation appears to be an important 
measure of work, it has the same limitation as lines-of-code 
measures. Pages of documentation cannot be determined accurately
early in the development cycle. The next section discusses
some other measures that can be used to develop models
for early estimation of resource expenditures and program 
size. 

3. MODELS FOR EARLY ESTIMATION

Few objective measures are available early in the software 
development process. The following five measures were con- 
sidered in this analysis: 

• Number of subsystems - requirements analysis 

• Number of data sets - preliminary design 

• Complexity (PRICE-S) - preliminary design 

• Number of new modules - detailed design

• Number of reused modules (extensively modified, slightly 
modified, and old) - detailed design 

The following sections discuss the use of these measures for 
early estimation of program size and resource expenditures. 
3.1 PROGRAM SIZE


The correlations of the measures described here with deliv- 
ered lines of code are compared in Table 5. Three regression 
models were developed (Tables 6, 7, and 8). The two most 
useful of these are the following: 


L_del = 7596 S (5)

L_del = 168 N + 195 R (6)

where L_del = delivered lines of code

S = number of subsystems
N = number of new modules
R = number of reused modules


Equation 5 (r = 0.99) defines an estimating relationship for 
program size that can be used during the requirements analy- 
sis phase. Equation 6 (r = 0.98) defines an estimating re- 
lationship of comparable reliability that can be used during 
the design phase. 
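
The two size relationships can be sketched in code as follows. The function names and the example counts are hypothetical. The coefficients are those of Equations 5 and 6 and should be assumed to hold only for systems similar to the sample analyzed here.

```python
def size_from_subsystems(subsystems):
    """Equation 5: L_del = 7596 S (requirements analysis phase)."""
    return 7596 * subsystems


def size_from_modules(new_modules, reused_modules):
    """Equation 6: L_del = 168 N + 195 R (design phase)."""
    return 168 * new_modules + 195 * reused_modules


if __name__ == "__main__":
    # Hypothetical system: 4 subsystems, later designed as
    # 120 new modules and 60 reused modules.
    print(size_from_subsystems(4))     # 30384
    print(size_from_modules(120, 60))  # 31860
```

An early estimate from the subsystem count can be replaced by the module-based estimate once the design identifies new and reused modules.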

3.2 RESOURCE EXPENDITURES 

The correlations of the measures described here with staff- 
hours of effort are compared in Table 9. Three regression 
models were developed (Tables 10, 11, and 12). The two most 
useful of these are the following: 

H_s = 1634 S (7)

H_s = 45 N + 28 R (8)

where H_s = staff-hours

S = number of subsystems 
N = number of new modules 
R = number of reused modules 

Equation 7 (r = 0.93) defines an estimating relationship for 
resource expenditures that can be used during the require- 
ments analysis phase. Equation 8 (r = 0.94) defines an 
estimating relationship of higher reliability that can be
used during the design phase. 
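
As a small worked example of Equations 7 and 8, the sketch below estimates staff-hours first from a subsystem count and then from module counts. The function names and the inputs are hypothetical, and the coefficients should be assumed to apply only to environments similar to the one sampled.

```python
def effort_from_subsystems(subsystems):
    """Equation 7: H_s = 1634 S (requirements analysis phase)."""
    return 1634 * subsystems


def effort_from_modules(new_modules, reused_modules):
    """Equation 8: H_s = 45 N + 28 R (design phase)."""
    return 45 * new_modules + 28 * reused_modules


if __name__ == "__main__":
    # Hypothetical system: 4 subsystems, later designed as
    # 120 new modules and 60 reused modules.
    print(effort_from_subsystems(4))     # 6536
    print(effort_from_modules(120, 60))  # 7080
```

In Equation 8 a reused module costs 28 staff-hours against 45 for a new one, so in this model reuse reduces effort per module but does not make it free.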

4. CONCLUSION 

The preceding analysis has demonstrated two important points. 
These are the following: 

• New measures of productivity that incorporate development
products other than lines of code must be investigated.
Pages of documentation is a good candidate.

• Effective estimates of program size and resource ex- 
penditures can be made using measures that are avail- 
able early in the development cycle. 
Table 1. Components of Programmer Effort

Table 9. Comparison of Early Resource Estimators

SUMMARY STATISTICS FOR MEASURES

Figure 1. Relationship of Modules to Size

Figure 3. Relationship of Modules to Total Staff Effort

Figure 4. Relationship of System to Total Staff Effort





